BACKGROUND OF THE INVENTION

The present invention relates generally to phase-locked loops, and more
specifically to linear full-rate phase detectors and clock and data recovery circuits.

Data networking has exploded over the last several years, and has changed the
way people work, get information, and spend leisure time. Local Area Networks (LANs) in
the workplace allow for centralized database and file sharing and archiving. Wireless
Application Protocol (WAP) enabled mobile phones operating over a Wide Area Network
(WAN) allow users to access news updates and stock quotes. The Internet has transformed
shopping and research, and has spawned a new recreational activity - Web surfing. Many
computers are used primarily as interfaces to these networks, thus the expression "the
network is the computer" has become popularized.

Devices such as Network Interface Cards (NICs), bridges, routers, switches,
and hubs move data between users, between users and servers, or between servers. Data
moves over a variety of media such as fiber optic or twisted pair cables, and the air. These
media are similar in that they distort data, making it difficult for a receiving device to read.
Light-waves in a fiber optic cable travel not only down the cable's core, but bounce off the
core-cladding interface, and thus tend to disperse. Twisted pair cables have filtering
properties that tend to attenuate higher frequencies. This limited bandwidth also creates
interference between individual data bits, known as Inter-Symbol Interference (ISI).
Wireless signals tend to bounce off buildings and other surfaces in a phenomenon known as
multipath, which results in the smudging of one data bit into the next.

Therefore, each of these devices, NICs, bridges, routers, switches, and hubs,
receives distorted data and must "clean it up", or retime it, for use either by the device itself, a
device attached to it, or for re-transmission. A useful building block for this is the
phase-locked loop (PLL). PLLs accept distorted data, and provide a CLOCK signal and retimed (or
recovered) data as outputs.

But the task for PLLs has lately become much more difficult. Equipment
operating at data rates of one Gigabit per second is replacing 100 Megabit devices, which
recently replaced 10 Megabit units. Exacerbating this problem is the competitive nature of
the networking business itself. Pricing pressures are enormous, and using high speed,
specialized processes raises system costs. Thus, the goal is to create integrated circuits that
are capable of operating at these data rates, but which can be made using relatively
inexpensive process technologies. What is needed are PLLs which can be made
inexpensively, while still operating at these high frequencies.

SUMMARY OF THE INVENTION

Accordingly, the present invention provides a phase detector having relaxed
timing requirements that allow the use of less costly processes. Specifically, the insertion of
a delay element in a phase detector consistent with the present invention separates the signal
paths for error and reference signal generation. In the absence of the delay, data must be
transferred from one flip-flop to another in one-half a clock period. With the addition of a
delay approximately equal to one-half a clock cycle, the transfer has almost an entire clock
period in which to occur. In addition, another flip-flop is added to accommodate the timing
requirements and to provide better matching of the critical high speed signals.

An exemplary embodiment of the present invention provides a method
including receiving a data signal having a first data rate and receiving a clock signal having
a first clock frequency, and alternating between a first level and a second level. The data
signal is stored when the clock signal alternates from the first level to the second level, and
the stored data signal is provided as a first signal a first amount of time later. The first signal
is stored when the clock signal alternates from the first level to the second level, and the
stored first signal is provided as a second signal a second amount of time later. A third signal
is provided by delaying the first signal for a third amount of time. The third signal is stored
when the clock signal alternates from the second level to the first level, and the stored third
signal is provided as a fourth signal a fourth amount of time later. A fifth signal is provided
by delaying the data signal a fifth amount of time. An error signal is generated by taking the
exclusive-OR of the first and fifth signals, and a reference signal is generated by taking the
exclusive-OR of the second and fourth signals. The first data rate is equal to the first clock
frequency.

A further embodiment of the present invention provides an apparatus for
recovering data from a received data signal. The apparatus includes a first storage device
configured to generate a first signal by receiving and storing the received data signal, a
second storage device configured to generate a second signal by receiving and storing the
first signal, and a first delay block configured to generate a third signal by delaying the first
signal. This embodiment also provides for a third storage device configured to generate a
fourth signal by receiving and storing the third signal, a second delay block configured to
generate a fifth signal by delaying the received data signal, a first logic gate configured to
perform an exclusive-OR of the second and fourth signals, and a second logic gate configured
to perform an exclusive-OR of the first and fifth signals. When the first storage device stores
the received data, the second storage device stores the first signal, and the third storage
device does not store the third signal. When the third storage device stores the third signal,
the first storage device does not store the received data, and the second storage device does
not store the first signal.

Yet a further exemplary embodiment of the present invention provides an
apparatus for recovering data from a received data signal. The apparatus includes a first
flip-flop having a data input coupled to a first data input port, and a clock input coupled to a first
clock port, a second flip-flop having a data input coupled to an output of the first flip-flop, and a
clock input coupled to the first clock port, and a first delay element having an input coupled
to the output of the first flip-flop. This embodiment also provides a third flip-flop having a
data input coupled to an output of the first delay element, and a clock input coupled to a
second clock port, as well as a second delay element having an input coupled to the first data
input port. A first exclusive-OR gate having a first input coupled to the output of the second
flip-flop, and a second input coupled to an output of the third flip-flop, and a second
exclusive-OR gate having a first input coupled to the output of the first flip-flop and a second
input coupled to the second delay element, are also included. The signal at the second clock
port is the complement of the signal at the first clock port.

A better understanding of the nature and advantages of the present invention
may be gained with reference to the following detailed description and the accompanying
drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a block diagram of an exemplary optical transceiver that
incorporates one embodiment of the present invention;

Figure 2 is a block diagram of a clock and data recovery circuit consistent with
one embodiment of the present invention;

Figure 3 is a block diagram of a full-rate phase detector consistent with
one embodiment of the present invention;

Figure 4 is a schematic of a flip-flop which may be used in the full-rate phase
detector of Figure 3;

Figure 5 is a schematic of a delay block which may be used in the full-rate
phase detector of Figure 3;

Figure 6 is a schematic of an XOR gate which may be used in the full-rate
phase detector of Figure 3;

Figure 7 is a generalized timing diagram of a phase detector consistent with one
embodiment of the present invention;

Figure 8 illustrates the timing diagram of Figure 7 with a specific data pattern,
and no phase error;

Figure 9 is the timing diagram of Figure 8 with a phase error introduced;

Figure 10 shows the error and reference voltages as a function of phase error
for the full-rate phase detector of Figure 3; and

Figure 11 is a flowchart of a method of recovering data and clock signals
consistent with the present invention.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

Figure 1 is an exemplary block diagram of an optical transceiver which
incorporates one embodiment of the present invention. This figure, as with all the included
figures, is for illustrative purposes, and does not limit the possible applications of the present
invention, or limit the appended claims. This optical transceiver may be on a NIC card with a
media access controller, some memory, and other circuits. Included is a receive path
including a photo diode 110, sensing resistor 112, pre-amplifier 120, amplifier 130, DC offset
correction circuit 150, clock and data recovery circuit 140, and link and data detect 160. A
transmit path having an amplifier 170, Light Emitting Diode (LED) driver 180, multiplexer
175, oscillator 185, and LED 190 is also shown. Instead of the LED driver 180 and LED
190, the light emitting subsystem may also consist of a laser driver and laser diode.

A receive fiber optic cable 105 carries an optical data signal to the
reverse-biased photo diode 110. Photo diode 110 senses the amount of light from fiber optic cable
105, and a proportional leakage current flows from the device cathode to anode. This current
flows through sense resistor 112, thus generating a voltage. This voltage is amplified by
pre-amplifier 120, and sent to amplifier 130. DC offsets are reduced by DC correction circuit
150. The output of the amplifier 130 drives the clock and data recovery circuits 140, as well
as the link and data detect block 160. The clock and data recovery circuits extract the
CLOCK signal embedded in the data provided on line 135 by the amplifier, and use it to
retime the data for output on lines 143. If the link and data detect block 160 senses either a
data or link signal at the data line 135, a valid link signal is asserted on line 167. If the link
and data detect block 160 senses a data signal at the data line 135, a receive squelch signal is
de-asserted on line 163.

Transmit data is provided on line 173 to amplifier 170. Amplifier 170 is
enabled by the transmit enable signal on line 177. When amplifier 170 is enabled, transmit
data is passed to the multiplexer 175. Multiplexer 175 passes the transmit data to the LED
driver 180 which in turn generates a current through light emitting diode (LED) 190. When
current is driven through LED 190, light is emitted and transmitted on fiber optic cable 195.
When the LED driver 180 is not driving current through LED 190, the LED is off, and the
fiber optic cable 195 is dark. If the amplifier 170 is disabled, multiplexer 175 selects the idle
signal from oscillator block 185. Oscillator block 185 provides an idle signal through the
multiplexer 175 to the LED driver 180. This idle signal is used by the receiver to ensure that
a valid optical connection has been made at both ends of the fiber-optic cable 105.

As discussed above, the physical media limitations distort the received signal.
Moreover, the delay through the amplifier 170, multiplexer 175, LED driver 180, and LED
190 may not be the same for a light-to-dark as for a dark-to-light transition. This mismatch
causes what is referred to as duty cycle distortion. Further, electrical noise in the power
supply and data path creates jitter and phase noise, in which the delay through the
transmitter changes as a function of time. It is the function of clock and data recovery
circuits, such as block 140, to retime the data so it is in a more useable form for further data
processing, and provide a CLOCK synchronized to the data.

Figure 2 is a simplified block diagram of a clock and data recovery circuit,
also known as a phase-locked loop, consistent with one embodiment of the present invention.
This architecture is shown for exemplary purposes, and does not limit either the possible
applications of the present invention, or the appended claims. Other architectures will be
readily apparent to those skilled in the art. For example, the retiming block 210 may be
included in the phase detector 220. Further, the phase detector 220 and frequency detector
230 may be the same circuit under the control of a mode switch. Included in this figure are
retiming block 210, phase detector 220, frequency detector 230, loop filter 240, and VCO
250.

At startup, the loop adjusts the VCO to the correct frequency. Startup may be
initiated by the power supply turning on, by the reception of a valid link by the receiver, or
other appropriate event. A reference clock is provided on lines 235 to the frequency detector
230. The reference clock is a comparatively low-frequency signal generated by a stable
oscillation source, for example a crystal. The output of the VCO 250, the CLOCK signal on
lines 255, is typically divided down by an integer and compared to the reference
clock by the frequency detector 230. The CLOCK signal may be single-ended or differential.
If the CLOCK signal is single-ended, lines 255 are simply one line. The frequency detector 230
provides an output voltage which is filtered by the loop filter 240, and
provided to the VCO 250 as tuning voltage VTUNE 245. If the frequency of the CLOCK
signal on lines 255 is too high, the frequency detector 230 changes its output voltage, and
VTUNE on line 245, in such a direction as to lower the CLOCK signal's frequency.
Conversely, if the CLOCK signal on lines 255 is too low in frequency, the frequency detector
230 changes its output voltage, and VTUNE on line 245, in such a direction as to raise the
CLOCK signal's frequency.
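
The frequency acquisition decision described above can be sketched behaviorally. The
following Python fragment is only an illustration under stated assumptions: the function
name, the sign convention (+1 to raise, -1 to lower), the 25 MHz reference, and the divide
ratio of 40 are all hypothetical and are not taken from the figures.

```python
def tuning_direction(vco_hz, reference_hz, divide_by):
    """Return -1 to lower, +1 to raise, or 0 to hold the VCO frequency.

    Comparing vco_hz / divide_by with reference_hz is equivalent to
    comparing vco_hz with reference_hz * divide_by; the second form
    keeps the arithmetic in integers.
    """
    target_hz = reference_hz * divide_by
    if vco_hz > target_hz:
        return -1  # CLOCK too high: move VTUNE so the frequency drops
    if vco_hz < target_hz:
        return 1   # CLOCK too low: move VTUNE so the frequency rises
    return 0       # divided CLOCK matches the reference


# Hypothetical 25 MHz crystal reference and divide-by-40 for a 1 GHz CLOCK.
print(tuning_direction(1_050_000_000, 25_000_000, 40))
print(tuning_direction(950_000_000, 25_000_000, 40))
```

A real frequency detector produces an analog output that is filtered by the loop filter 240;
the three-valued result here only captures the direction of the correction.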

Once the CLOCK signal on lines 255 is tuned to the correct frequency, the
phase detector 220 becomes active, and the frequency detector 230 becomes inactive. A
DATA signal is received by the data retiming block 210 and phase detector 220 on lines 205.
The DATA signal may be single-ended or differential. If the DATA signal is single-ended,
line 205 is simply one line. Phase detector 220 compares transitions in the DATA signal on
lines 205 to the rising edges of the CLOCK signal on lines 255, and produces an ERROR
signal on line 222 that is proportional to the phase relationship between them. Alternately,
the phase detector 220 can be designed so that the transitions in the DATA signal are
compared to the falling edges of the CLOCK signal. The ERROR signal may be
single-ended or differential. If the ERROR signal is single-ended, line 222 is simply one line.
Phase detector 220 also produces a REFERENCE signal on line 224 that can be subtracted
from the ERROR signal to generate a data pattern independent correction signal. The
REFERENCE signal may be single-ended or differential. If the REFERENCE signal is
single-ended, line 224 is simply one line. The ERROR and REFERENCE signals are filtered
by the loop filter 240 resulting in a voltage VTUNE 245.

As its name implies, the voltage controlled oscillator is an oscillator, the
frequency of which is controlled by VTUNE. As VTUNE changes, so does the oscillation
frequency. If the DATA on lines 205 and the CLOCK on lines 255 do not have the desired
phase relationship, the error voltage, and thus VTUNE, changes in the direction necessary to
adjust the VCO in order to correct the phase error. Specifically, if the DATA signal on lines
205 comes too soon, that is, it is advanced in time relative to the CLOCK signal on lines 255,
the phase detector increases the ERROR voltage on line 222. This results in a change in the
VTUNE voltage 245 that increases the frequency of the CLOCK 255. As the frequency of
the CLOCK signal on lines 255 increases, its edges come sooner in time, that is, they advance.
This, in turn, brings its rising edges into alignment with transitions in the data signal on lines
205. As the edges move into alignment, the error signal on line 222 reduces, changing
VTUNE 245, thereby reducing the frequency of the CLOCK signal on lines 255. This
feedback ensures that the DATA and CLOCK signals have the proper phase relationship for
the retiming of the data by retiming block 210. In this condition the loop is said to be locked.
Hence, these clock and data recovery circuits are often referred to as phase-locked loops, or
PLLs.
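
The feedback action described above can be illustrated with a small discrete-time model.
This is a sketch under assumptions, not the circuit of Figure 2: the proportional-plus-integral
loop filter, the gains kp and ki, the initial phase error, and the frequency offset are all
hypothetical values chosen only so that the loop visibly settles. Phase error is expressed in
clock cycles, with a positive value meaning the DATA is early relative to the CLOCK.

```python
def simulate_lock(initial_phase_error=0.2, freq_offset=0.01,
                  kp=0.2, ki=0.02, steps=400):
    """Return the phase error after each update of a simplified loop."""
    phase_error = initial_phase_error
    integrator = 0.0
    history = []
    for _ in range(steps):
        # A linear detector: (ERROR - REFERENCE) is taken as proportional
        # to the phase error.
        detector_out = phase_error
        # Assumed loop filter: proportional path plus an integrator.
        integrator += ki * detector_out
        vtune = kp * detector_out + integrator
        # A higher VTUNE speeds up the VCO, advancing the CLOCK edges and
        # shrinking the error; freq_offset models DATA running slightly fast.
        phase_error += freq_offset - vtune
        history.append(phase_error)
    return history


trace = simulate_lock()
print(trace[0], trace[-1])
```

With these assumed gains the error decays toward zero while the integrator absorbs the
frequency offset, which corresponds to the locked condition.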

The ERROR signal on line 222 and the REFERENCE signal on line 224
provide a relatively low frequency, essentially differential, correction signal. This provides
several important benefits. For example, the use of a REFERENCE signal gives context to
the ERROR signal, reducing the data dependent phase errors which would otherwise result.
If there are no data transitions this loop has no ERROR or REFERENCE signal information
to use to lock, but since there is also no data to recover, this special case is of no interest.

Also, conventional systems often employ what is known as a "bang-bang"
phase detector. In bang-bang detectors, for each data edge, depending on its relation to the
clock, a charge-up or charge-down signal is sent to a charge pump. Such detectors alternate
between advancing and delaying the clock signal from the VCO, and never reach a stable
point. Accordingly, bang-bang detectors always have a certain amount of systematic jitter.
Moreover, these pulses have fast edges containing high frequency components that couple to
the supply voltage and inject noise into other circuits. Reducing this noise requires either
filtering, or using separate supply lines decoupled from each other. By using a low
frequency, effectively differential signal out, the linear full-rate phase detector of the present
invention does not have this systematic jitter, and does not disturb the power supply and other
circuits to the same extent.
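
The difference between the two detector types can be seen in a deliberately simplified
first-order model. The sketch below is illustrative only: it omits the loop filter and VCO
dynamics, and the step size, gain, and initial error are hypothetical values.

```python
def settle(detector, initial_error=0.13, steps=200):
    """Apply a detector's correction repeatedly and record the phase error."""
    error = initial_error
    trace = []
    for _ in range(steps):
        error -= detector(error)
        trace.append(error)
    return trace


def bang_bang(error, step=0.02):
    # Fixed-size correction whose sign depends only on early versus late.
    return step if error > 0 else -step


def linear(error, gain=0.1):
    # Correction proportional to the measured phase error.
    return gain * error


bb_tail = settle(bang_bang)[-50:]
lin_tail = settle(linear)[-50:]
print(max(bb_tail) - min(bb_tail))    # residual dither of the bang-bang loop
print(max(abs(e) for e in lin_tail))  # residual error of the linear loop
```

In this model the bang-bang loop keeps hopping across zero by a fixed step, while the
proportional correction shrinks together with the error.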

Figure 3 is a block diagram 300 of a phase detector consistent with one
embodiment of the present invention. This phase detector may be used as the phase detector
220 in Figure 2. Alternately, it may be used in other PLL architectures. For example, it may
be used in an architecture with a charge pump between the phase detector and loop filter.
The phase detector shown may be used in a PLL in a fiber optic transceiver, as shown in
Figure 1. Alternately, it may be used in a PLL in other systems. Phase-locked loops are
particularly important where a data processing system interfaces with a physical medium.
Accordingly, this phase detector may be used in PLLs in twisted pair or coaxial transceivers,
disk-drive or other mass-storage read channels, wireless receivers, routers, NICs, bridges,
switches, hubs, and other similar circuits.

Included in block diagram 300 are first flip-flop 310, second flip-flop 350,
third flip-flop 330, delay block 340, C2Q delay 320, and XOR gates 360 and 370. The
flip-flops are negative-edge triggered devices. Specifically, the first flip-flop 310 and third
flip-flop 330 change state on the falling edges of the clock, while the second flip-flop 350 changes
state on the rising edges of the clock. Alternately, positive-edge triggered devices may be
used. If negative-edge triggered devices are used, the phase detector aligns the data
transitions to the clock rising edges. If positive-edge triggered devices are used, the phase
detector aligns the data transitions to the clock falling edges. All signal paths shown may be
differential or single-ended. For example, Q1 may be a differential signal including the first
flip-flop 310 output signals Q and its complement, QBAR. In a preferred embodiment, all
signal paths are differential. Using differential signals reduces the jitter caused by noise from
such sources as the power supply and bias lines. Modifications to this block diagram will be
readily apparent to one skilled in the art. For example, the third flip-flop 330 may be
replaced with a matching delay element.

DATA on line 305 is received by the first flip-flop 310 and C2Q delay block
320. In a preferred embodiment, the delay through the C2Q delay block approximately
equals the clock-to-Q delay of the first flip-flop 310. The clock-to-Q delay for a flip-flop is
the delay of the output changing in response to a clock edge. The first flip-flop 310 is
clocked by the CLOCK signal on lines 355 from a VCO or other oscillating circuit. On each
CLOCK falling edge, the data on lines 305 is latched by the first flip-flop 310 and held at the
Q output as signal Q1 on line 315. The signal Q1 on line 315 is stored in the third flip-flop
330 on each falling edge of the CLOCK 355, is delayed by the delay block 340, and is applied as
an input to XOR gate 370. The output of the C2Q delay block 320, C2QX on line 323, is
applied to the B input of XOR gate 370. The output of the XOR gate 370 is the ERROR
signal on line 322. The output of the delay block 340, DEL on line 342, is stored in the
second flip-flop 350 on every CLOCK rising edge. The output of the third flip-flop 330, Q3 on
line 335, is applied to the A input of XOR gate 360. The output of the second flip-flop 350,
Q2 on line 356, is coupled to the B input of XOR gate 360. The output of XOR gate 360 is
the REFERENCE signal on line 324.
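
The signal flow just described can be checked with an idealized behavioral model. The
Python sketch below is an illustration under assumptions, not a circuit-accurate simulation:
time is counted in integer steps, the clock period is 100 steps, every flip-flop and the C2Q
block have the same 5-step delay, delay block 340 is 60 steps, and set-up and hold times are
zero. These numbers, the function name, and the `early` parameter (how far DATA transitions
lead the CLOCK rising edges) are hypothetical.

```python
def phase_detector_pulse_totals(bits, period=100, early=0, tcq=5, delay=60):
    """Return total high time of (ERROR, REFERENCE), in time steps."""
    half = period // 2

    def last_fall(t):
        # Latest falling edge at or before t (edges at k*period + half).
        return ((t - half) // period) * period + half

    def last_rise(t):
        # Latest rising edge at or before t (edges at k*period).
        return (t // period) * period

    def data(t):
        # Bit k occupies [k*period - early, (k+1)*period - early);
        # outside the pattern the first or last bit is held.
        k = (t + early) // period
        return bits[min(max(k, 0), len(bits) - 1)]

    def q1(t):    # first flip-flop 310: samples DATA on falling edges
        return data(last_fall(t - tcq))

    def c2qx(t):  # C2Q delay 320: DATA delayed by one clock-to-Q delay
        return data(t - tcq)

    def del_(t):  # delay block 340: Q1 delayed
        return q1(t - delay)

    def q2(t):    # second flip-flop 350: samples DEL on rising edges
        return del_(last_rise(t - tcq))

    def q3(t):    # third flip-flop 330: samples Q1 on falling edges
        return q1(last_fall(t - tcq))

    error = reference = 0
    for t in range((len(bits) + 3) * period):
        error += q1(t) ^ c2qx(t)      # XOR gate 370
        reference += q2(t) ^ q3(t)    # XOR gate 360
    return error, reference


pattern = [0, 1, 1, 0, 1, 0, 0, 1]  # five transitions
print(phase_detector_pulse_totals(pattern, early=0))
print(phase_detector_pulse_totals(pattern, early=10))
```

In this model each data transition produces a REFERENCE pulse of half a period and an
ERROR pulse of half a period plus the amount by which the DATA is early, so the two totals
match when the transitions are aligned with the rising edges, whatever the data pattern.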

The signal delay duration provided by delay block 340 is greater than one-half
a CLOCK cycle, less the clock-to-Q delay of the first flip-flop 310, plus the hold time of the
second flip-flop 350. This duration is also less than one and one-half CLOCK cycles (three
transitions of the clock), less the clock-to-Q delay of flip-flop 310, less the set-up time of the
second flip-flop 350. The set-up time is the time that data must be present at a flip-flop's
input before a clock signal edge to ensure that the data is properly clocked into the flip-flop.
The hold time is the time that data must be present at a flip-flop's input after a clock signal
edge to ensure that the data is properly clocked into the flip-flop. The delay through the
delay block 340 decouples the signal path used to generate the REFERENCE signal on line
324 from the signal path used to generate the ERROR signal on line 322. Without the delay
block 340, the output of the first flip-flop 310, Q1 on line 315, would couple directly to the D
input of the second flip-flop 350. But this would mean the data signal would have to be
clocked out of the first flip-flop 310 and into the second flip-flop 350 in less than one-half a
CLOCK cycle. This demanding timing requires using a great deal of power in both the first
flip-flop 310 to reduce its clock-to-Q delay, and the second flip-flop 350 to reduce its set-up
time. For some inherently slower technology, such as a standard CMOS process, it may
simply be impossible to meet this timing requirement. With the addition of the delay block
340, the most demanding timing path is from the output of the first flip-flop 310 into the third
flip-flop 330. But there is an entire CLOCK cycle for this to occur, which is a much less
stringent criterion.
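
The delay bounds and the relaxed timing budget stated above reduce to simple arithmetic. The
sketch below encodes them directly; the function names and the example numbers (a 1000 ps
clock period, 150 ps clock-to-Q delay, 100 ps set-up time, and 50 ps hold time) are
hypothetical and serve only as a worked example.

```python
def delay_window(period, clk_to_q, setup, hold):
    """Bounds on the delay of block 340, in the same units as the inputs."""
    lower = period / 2 - clk_to_q + hold     # more than T/2 - tCQ + tHOLD
    upper = 1.5 * period - clk_to_q - setup  # less than 3T/2 - tCQ - tSETUP
    return lower, upper


def transfer_slack(period, clk_to_q, setup, with_delay_block):
    """Time left over for the flip-flop to flip-flop transfer."""
    # Without block 340 the transfer must fit in half a cycle; with it,
    # the critical path (flip-flop 310 to flip-flop 330) has a full cycle.
    available = period if with_delay_block else period / 2
    return available - clk_to_q - setup


print(delay_window(1000, 150, 100, 50))
print(transfer_slack(1000, 150, 100, with_delay_block=False))
print(transfer_slack(1000, 150, 100, with_delay_block=True))
```

For these assumed numbers a delay of about one-half a clock cycle (500 ps) falls inside the
allowed window.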

To improve performance, some circuit delay times and trace paths should be
matched to each other. Specifically, the first flip-flop's clock-to-Q delay and the trace
coupling the first flip-flop 310 to the XOR gate 370 should match the delay through the C2Q
block 320 and the trace coupling the C2Q block 320 and the XOR gate 370. Also, the second
flip-flop's clock-to-Q delay and the trace coupling the second flip-flop 350 to the XOR gate 360
should match the third flip-flop's clock-to-Q delay and the trace coupling the third flip-flop
330 to the XOR gate 360. By employing identical second and third flip-flops 350 and 330,
and identical XOR gates 360 and 370, one can easily achieve an almost perfect match
between the two signals generating the REFERENCE signal. But it is more difficult to match
the clock-to-Q delay of the first flip-flop 310 with a delay element such as the C2Q delay
320. However, by decoupling the generation of the REFERENCE and ERROR signals, the
difficult matching of the two signals producing the ERROR information can be independently
adjusted and optimized. Better matching ensures that if the DATA signal transitions are
aligned with the CLOCK rising edges, then the resulting ERROR and REFERENCE signal
pulses match. To adjust these delays, one embodiment of the present invention has extra
devices which may be configured as capacitors. These capacitors may be connected to a
signal path in order to slow a signal down, such that it matches another signal more
accurately. For example, one embodiment has capacitors on the C2QX traces 323, so that the
delay from the C2Q block 320 to the XOR gate 370 matches the delay from the first flip-flop
310 to the XOR gate 370.

Figure 4 is a schematic for an exemplary circuit implementation of a
negative-edge triggered flip-flop which may be used as the first flip-flop 310, the second flip-flop 350,
or the third flip-flop 330 in Figure 3. It will be obvious to one skilled in the art that other
flip-flops can be used, for example a bipolar flip-flop could be used. Alternately, a flip-flop with
current source loads, or source follower outputs could be used. The flip-flop is made up of two
latches in series. Included are an input differential pair of the first latch M1 410 and M2 415,
latching devices M3 420 and M4 425, and CLOCK pair M9 450 and M10 455. Also
included are the input differential pair of the second latch M5 430 and M6 435, latching pair
M7 440 and M8 445, and CLOCK pair M11 460 and M12 465. Load resistors R1 485, R2
490, R3 495, R4 497, and current sources M14 470 and M15 480 are also shown.

Bias voltage VCS is applied to the gates of M14 470 and M15 480 relative to
their sources, which are coupled to line 417. This bias voltage generates currents in the
drains of M14 470 and M15 480. When the CLOCK signal is high, that is, the signal level of
CLOCKP on line 409 is higher than the signal level of CLOCKN on line 411, the first latch is
in the pass mode and the second latch is in the latched mode. Specifically, the drain current
of M14 470 is passed through M9 450 to the input differential pair M1 410 and M2 415, and
the drain current of M15 480 passes through device M12 465 to the latching pair M7 440 and
M8 445. If the voltage at D is high, that is, the voltage on line DP 402 is higher than the
voltage DN on line 407, the drain current of M9 450 flows through device M1 410 and into load
resistor R1 485, thereby lowering the voltage at the drain of M1 410. The device M2 415 is
off, and so the voltage at its drain is high. If the voltage at QN on line 419 is high, the
drain current from M12 465 passes through device M8 445 across the third load resistor R3
495, and so QN remains high.

When the CLOCK signal is low, that is, the signal level of CLOCKP on line
409 is lower than the signal level of CLOCKN on line 411, the drain current of M14 470 passes
through M10 455, and the drain current of M15 480 passes through device M11 460. If the
signal level at DP had previously been high such that the voltage at the drain of M1 410 had
been low, the drain current of M10 455 passes through the device M3 420 across the load
resistor R1 485, thus keeping the voltage at that node low. Furthermore, the latch pair M7 440
and M8 445 are off, and input pair M5 430 and M6 435 are on, and follow the data signal
provided by latch pair M3 420 and M4 425. Therefore, for each CLOCK falling edge, that is,
when the voltage on line 411 exceeds the voltage on line 409, the data at the input port DP
and DN is latched by the first latch and output by the second latch on lines QP 417 and QN
419.
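
The two-latch behavior described above can be summarized with a logic-level model. The
Python sketch below is a behavioral illustration only: it ignores the differential signaling,
currents, and delays of Figure 4, and the class and method names are hypothetical.

```python
class Latch:
    """Level-sensitive latch: follows d while transparent, else holds."""

    def __init__(self):
        self.q = 0

    def update(self, d, transparent):
        if transparent:
            self.q = d
        return self.q


class NegEdgeFlipFlop:
    """Two latches in series, as in the flip-flop described above."""

    def __init__(self):
        self.first = Latch()   # passes while CLOCK is high
        self.second = Latch()  # passes while CLOCK is low

    def step(self, d, clock):
        held = self.first.update(d, transparent=(clock == 1))
        return self.second.update(held, transparent=(clock == 0))


ff = NegEdgeFlipFlop()
stimulus = [(1, 1), (1, 0), (0, 0), (0, 1), (0, 0)]  # (D, CLOCK) pairs
print([ff.step(d, clk) for d, clk in stimulus])
```

The output changes only on steps where the clock has just gone low, taking the value the
first latch captured while the clock was high. Changes on D while the clock is low do not
reach the output until the next falling edge.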

If this flip-flop is used for the flip-flops in Figure 3, the following should be
noted. If the signals are differential, DP, CLOCKP and QP correspond to the D, clock, and Q
ports of the flip-flops in Figure 3. For the second flip-flop 350, the CLOCKP and CLOCKN
connections should be reversed relative to the other flip-flops, as indicated by the circle at its
clock input. If single-ended signals are used, DN and CLOCKN (CLOCKP for the second
flip-flop 350) should be coupled to bias voltages which preferably have a DC voltage equal to
the average signal voltage at DP and CLOCKP (CLOCKN for the second flip-flop). This can
be changed into a positive-edge triggered flip-flop by reversing the CLOCKP and CLOCKN
lines.

Figure 5 is a schematic of an exemplary circuit implementation for a delay
circuit that may be used for delay block 340 in Figure 3. This same architecture can be used
to implement the C2Q block 320 in Figure 3 as well. It will be obvious to one skilled in the
art that this delay block could be designed several different ways. For example, an RC
network could be used. Included are input pair devices M1 530 and M2 540, cascode devices
M3 510 and M4 520, load resistors R1 560 and R2 570, and current source device M5 550.
An input signal is applied at the A port, AP on line 535 and AN on line 545, to the first input
pair M1 530 and M2 540. A bias voltage VCS is applied to the gate of M5 550 relative to its
source terminal that is coupled to line 507. VCS may be the same bias line as was used in
Figure 4. Alternately it may be a different bias voltage. This voltage generates a current in
the drain of M5 550. If the voltage at the A port is high, that is, the signal level of AP on
line 535 is higher than the signal level of AN on line 545, the drain current of M5 550 passes
through the device M1 530, through cascode device M3 510, to the first load resistor R1 560,
pulling the voltage XN on line 555 low. Conversely, if the signal at the A port is low, that is,
the signal level at AP is lower than the signal level at AN, the drain current of M5 550 is
passed through device M2 540, through cascode device M4 520, to the second load resistor
R2 570, pulling output XP on line 557 low. In this way a signal applied to input port A on
lines 535 and 545 results in a delayed signal appearing on lines XP 557 and XN 555.

Figure 6 is an exemplary XOR gate that may be used with various
embodiments of the present invention. For example, this XOR gate may be used as XOR
gates 360 and 370 in Figure 3. Alternately, other XOR gates may be used, such as a bipolar
XOR gate. Included are B input buffers M9 605 and M10 610, and M11 615 and M12 620,
and A input buffer M7 675 and M8 680. An XOR core made up of devices M1 630, M2 635,
M3 640, M4 645, M5 660, and M6 665, is also shown. Current sources M14 650, M15 655,
M16 670, and M17 685, are biased with a VCS voltage such that a current is produced in
their drains. The VCS voltage applied to all these devices may be equal to each other.
Alternately, different VCS voltages may be used for the buffers and the core. Further, the
buffers may have differing VCS voltages.

Signals at the A input steer the drain current of M16 670 through either M5 
660 or M6 665. The signal at the B input steers the current to the load resistors, thereby 
generating voltage outputs at QP and QN on lines 612 and 614. The connections are such 
that QP is high when either the A input or the B input, but not both, is high. 
To match the delay from input to output, two buffers are used in the B path, and one buffer is 
used in the A path. This is because the A input steers the lower devices M5 and M6, which 
then drive upper devices M1 through M4. But the B input drives devices M1 to M4 directly. 
Thus, to compensate for the delay through M5 660 and M6 665, an extra buffer is inserted in 
the B path. Resistor R7 682 lowers the common mode voltage of the output of the A input 
buffer, which improves the transient response of the lower differential pair M5 660 and M6 
665. 
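
The current steering described above can be illustrated with a small behavioral model. The following Python sketch is offered only as an illustration and is not part of the disclosed circuit. The function name xor_gate and the supply voltage, tail current, and load resistance values are hypothetical, and the particular cross-coupling of the upper pairs to the loads is an assumption, since the text states only the resulting logic function.

```python
def xor_gate(a_high, b_high, vdd=1.8, i_tail=1e-3, r_load=400.0):
    """Return (QP, QN) voltages for a current-steering XOR (behavioral sketch).

    vdd, i_tail, and r_load are illustrative assumptions, not values from the text.
    """
    # Lower pair (M5/M6 in the text): the A input selects which upper pair
    # receives the tail current from the current source.
    # Upper pairs (M1 through M4): the B input selects which load resistor
    # carries that current. The pairs are assumed to be cross-coupled so that
    # the current lands on the QN load exactly when A and B differ.
    if a_high:
        current_in_qn_load = not b_high
    else:
        current_in_qn_load = bool(b_high)
    drop = i_tail * r_load  # voltage drop across the load carrying the current
    qp = vdd if current_in_qn_load else vdd - drop
    qn = vdd - drop if current_in_qn_load else vdd
    return qp, qn


for a in (False, True):
    for b in (False, True):
        qp, qn = xor_gate(a, b)
        print(a, b, "QP high" if qp > qn else "QP low")
```

In this model QP is the higher output only when exactly one of the two inputs is high, which is the XOR behavior stated above. Buffer delays and the effect of resistor R7 are not modeled.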

An alternate embodiment for an XOR gate can be found in commonly 
assigned U.S. patent application number [illegible], filed February [illegible], 2001, titled "Linear 
Half-Rate Phase Detector and Clock and Data Recovery Circuit," attorney docket number 
[illegible], which is incorporated by reference. Also, other architectures which may 
be used to implement some of the circuits herein can be found in commonly assigned U.S. 
patent application number 09/484,856, filed January 18, 2000, titled "C3MOS Logic Family," 
attorney docket number 019717-000310US, which is also incorporated herein by reference. 

Figure 7 is a timing diagram of some of the various signals in a phase detector 
consistent with one embodiment of the present invention, such as the block diagram of 
figure 3. This and the following timing diagrams are not limited to the circuit of figure 3, 
however, and may be generated by other circuitry consistent with the present invention. 
Included are inputs CLOCK 710 and DATA 720, and resulting signals Q1 730, Q3 740, DEL 
750, Q2 760, ERROR 770, and REFERENCE 780. Data bits, such as 704 and 705, have a 
duration equal to one CLOCK cycle. Each data bit may be high or low, and the DATA signal 
720 may transition or remain constant from one bit to the next. 

Q1 730 is equal to the data signal 720 delayed in time and approximately 
aligned with the following falling edge of the CLOCK 710. There may be a delay between a 
transition of Q1 730 as compared to the falling edges of the CLOCK 710, particularly if Q1 is 
generated by a flip-flop (or register) clocked by falling edges of the CLOCK signal 710 and 
having the data signal 720 as its D input. Q3 740 is equal to Q1 730 delayed by one CLOCK 
cycle. There may be a delay between a transition of Q3 740 as compared to the falling edge 
of the CLOCK 710, particularly if Q3 is generated by a flip-flop (or register) clocked by 
falling edges of the CLOCK signal 710 and having Q1 730 as its D input. The signal DEL 
750 is a delayed version of Q1 730. Q2 760 is equal to DEL 750 delayed and approximately 
aligned with the next rising edge of the CLOCK signal 710. There may be a delay between a 
transition of Q2 as compared to the rising edge of CLOCK 710, particularly if Q2 is 
generated by a flip-flop (or register) clocked by the rising edges of the CLOCK signal 710, 
and having DEL 750 as its D input. 

The DATA signal 720 may be delayed an amount approximately equal to the 
delay of signal Q1 730 as compared to the CLOCK 710. This delayed data signal is referred 
to as CPQX in this timing diagram. For ease of explanation, all clock-to-Q delays are 
represented as zero, and therefore, the signal CPQX is shown as being equal to the DATA 
input 720. ERROR signal 770 is generated by XORing CPQX and Q1 730. REFERENCE 
signal 780 is generated by XORing Q2 760 and Q3 740. 
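
The signal relationships described above can be reproduced with a simple sampled-time model. The following Python sketch is offered only as an illustration under stated assumptions: all clock-to-Q, set-up, and hold times are zero (so CPQX equals DATA), the delay of DEL relative to Q1 defaults to one full CLOCK cycle, and the function name, parameter names, and sample resolution are hypothetical.

```python
def simulate_phase_detector(bits, samples_per_cycle=100, offset=0, del_samples=None):
    """Idealized full-rate phase detector waveforms, one list entry per sample.

    Rising CLOCK edges fall at multiples of samples_per_cycle and falling edges
    half a cycle later. offset > 0 delays DATA; offset < 0 advances it.
    """
    n = samples_per_cycle
    half = n // 2
    d = n if del_samples is None else del_samples  # DEL delay (assumed one cycle)
    init = bits[0]  # every storage element starts in the first bit's state
    data, q1, q3, dele, q2 = [], [], [], [], []
    for t in range(len(bits) * n):
        index = min(max((t - offset) // n, 0), len(bits) - 1)
        data.append(bits[index])
        q1_prev = q1[t - 1] if t else init
        q3_prev = q3[t - 1] if t else init
        q2_prev = q2[t - 1] if t else init
        del_prev = dele[t - 1] if t else init
        if t % n == half:
            # Falling edge: first flip-flop stores DATA, third stores the old Q1.
            q1.append(data[t - 1])
            q3.append(q1_prev)
        else:
            q1.append(q1_prev)
            q3.append(q3_prev)
        # DEL is Q1 delayed by d samples.
        dele.append(q1[t - d] if t >= d else init)
        # Rising edge: second flip-flop stores the value DEL held before the edge.
        q2.append(del_prev if t % n == 0 else q2_prev)
    return {
        "DATA": data, "Q1": q1, "Q3": q3, "DEL": dele, "Q2": q2,
        "ERROR": [a ^ b for a, b in zip(data, q1)],
        "REFERENCE": [a ^ b for a, b in zip(q2, q3)],
    }


# The pattern ends with a repeated bit so the last REFERENCE pulse completes.
waves = simulate_phase_detector([0, 1, 1, 0, 0])
print(sum(waves["ERROR"]), sum(waves["REFERENCE"]))
```

With DATA transitions aligned to the CLOCK rising edges, each data transition in this model produces one ERROR pulse and one REFERENCE pulse, each lasting half a CLOCK cycle.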

For some time period after each falling edge of the CLOCK signal 710, the 
ERROR signal 770 is low. This is because at each falling edge of the CLOCK 710, Q1 730 
follows the data signal 720. Accordingly, for some time period following each CLOCK 
falling edge Q1 730 and data 720 are equal in value. For example, in the time prior to the 
ERROR pulse 712, both CPQX and Q1 are in the state D2. Sometime later, the DATA 
signal 720 either transitions to a new level, or retains the same value. If DATA 720 changes 
to a new state, then DATA 720 and Q1 730 become unequal, and the ERROR signal 770 is 
high. If data signal 720 retains its value, however, ERROR signal 770 remains low. 
Specifically, if data bits D2 and D3 are equal, then ERROR pulse 712 is low. But if data bits 
D2 and D3 are not equal, then ERROR pulse 712 is high. 

ERROR signal 770 is dependent on the phase relationship between DATA 720 
and CLOCK 710 in the following manner. If data bit 704 is low and data bit 
705 is high, then ERROR pulse 712 is high. If the DATA signal 720 advances, that is, 
shifts to the left, then pulse 712 in the ERROR signal 770 widens (becomes longer in 
duration). If the DATA signal 720 is delayed, that is, shifted to the right, then pulse 712 of 
ERROR signal 770 narrows (becomes shorter in duration). But note as above, if data bit 
704 and data bit 705 are equal, then ERROR pulse 712 is low. Therefore, the average ERROR 
voltage is dependent not only on the phase error between CLOCK 710 and DATA 720, but also 
on the data pattern of DATA 720. For this reason, the ERROR signal 770 is most meaningful 
in the context of REFERENCE signal 780. 

This is because the REFERENCE signal's average value is also data 
dependent. For some time period following each rising edge of CLOCK signal 710, the 
REFERENCE signal 780 is low, since at each rising edge of the CLOCK 710, Q2 760 is 
equal in value to Q3 740. For example, in the time before REFERENCE pulse 717, both Q3 
and Q2 are in the state D2. In the next half CLOCK cycle Q3 has the value of the next data 
bit D3 while Q2 remains unchanged. Therefore, if the data bits D2 and D3 are equal then 
REFERENCE pulse 717 is low. But if data bits D2 and D3 are not equal, then REFERENCE 
pulse 717 is high. 

For random data, each data bit may be high or low with equal probability and 
may change state or remain constant at each transition, also with equal probability. Thus, 
each ERROR pulse, such as 712, has an equal probability of being high or low. Also each 
REFERENCE signal pulse, such as 717, has an equal probability of being high or low. If the 
DATA transitions are aligned with the rising edge of the CLOCK 710, the ERROR signal 
770 and the REFERENCE signal 780 are each low half the time and either high or low with 
equal probability the other half. This means that the ERROR signal 770 and REFERENCE 
signal 780 each have an average AC value equal to one-fourth their AC peak value. 
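
The one-fourth figure follows from multiplying the portion of each CLOCK cycle in which a pulse can occur by the probability that a pulse is actually high. The following Python sketch works through that arithmetic; the function and parameter names are hypothetical and are used only for illustration.

```python
from fractions import Fraction


def average_fraction_of_peak(transition_probability, pulse_fraction=Fraction(1, 2)):
    """Average signal level expressed as a fraction of its peak value.

    pulse_fraction: portion of each CLOCK cycle in which the signal can be high
    (one half when DATA transitions align with the CLOCK rising edges).
    transition_probability: chance that two consecutive data bits differ.
    """
    return Fraction(transition_probability) * Fraction(pulse_fraction)


# Random data: consecutive bits differ with probability one half.
print(average_fraction_of_peak(Fraction(1, 2)))  # prints 1/4
```

The same product applies equally to the ERROR and REFERENCE signals when the DATA transitions are aligned with the CLOCK rising edges.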

If the data is not random, for instance if DATA 720 consists of a long string of 
either high or low data bits, then ERROR pulses, such as 712, and REFERENCE pulses, such 
as 717, are low. The ERROR and REFERENCE signals' average values are at a minimum. 
But if the data changes every bit, then each ERROR pulse and each REFERENCE pulse 
is high. In that case, the ERROR and REFERENCE signals' average values are equal to one-half their peak 
values. Thus, the ERROR signal and the REFERENCE signal have the same data pattern 
dependency, while the ERROR signal also tracks the phase error. This means the data 
dependency of ERROR signal 770 can be corrected by subtracting the REFERENCE signal 
780. The difference signal between ERROR and REFERENCE is not dependent on the data 
pattern, but is dependent on the phase error. This resulting signal has approximately a zero 
value when the edges of the DATA signal are aligned with the CLOCK rising edges. As the 
DATA is delayed, the differential value becomes negative. As the DATA advances, the 
difference becomes positive. 
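
The sign behavior of the difference signal can be checked numerically. The following Python sketch is an idealized model only: clock-to-Q, set-up, and hold times are taken as zero, DEL is assumed to be Q1 delayed by exactly one CLOCK cycle, and all function names, parameter names, and test patterns are hypothetical. It reports the area of ERROR minus REFERENCE per data transition, in units of CLOCK cycles.

```python
def error_and_reference(bits, offset, n=100):
    """Return (ERROR, REFERENCE) sample lists for an idealized detector.

    n samples per CLOCK cycle; rising edges at multiples of n, falling edges
    half a cycle later. offset > 0 delays DATA; offset < 0 advances it.
    """
    half, init = n // 2, bits[0]
    data, q1, q3, q2 = [], [], [], []
    for t in range(len(bits) * n):
        index = min(max((t - offset) // n, 0), len(bits) - 1)
        data.append(bits[index])
        q1_prev = q1[t - 1] if t else init
        q3_prev = q3[t - 1] if t else init
        q2_prev = q2[t - 1] if t else init
        falling, rising = t % n == half, t % n == 0
        q1.append(data[t - 1] if falling else q1_prev)
        q3.append(q1_prev if falling else q3_prev)
        # DEL just before this sample equals Q1 from n + 1 samples earlier.
        del_prev = q1[t - 1 - n] if t > n else init
        q2.append(del_prev if rising else q2_prev)
    error = [a ^ b for a, b in zip(data, q1)]
    reference = [a ^ b for a, b in zip(q2, q3)]
    return error, reference


def difference_per_transition(bits, offset, n=100):
    """Area of (ERROR - REFERENCE) per data transition, in CLOCK cycles."""
    error, reference = error_and_reference(bits, offset, n)
    transitions = sum(a != b for a, b in zip(bits, bits[1:]))
    return (sum(error) - sum(reference)) / (n * transitions)


# Each pattern ends with a repeated bit so the last REFERENCE pulse completes.
sparse = [0, 1, 1, 0, 0]
dense = [0, 1, 0, 1, 0, 0]
for offset in (-10, 0, 10):
    print(offset,
          difference_per_transition(sparse, offset),
          difference_per_transition(dense, offset))
```

In this model the difference is zero when the DATA edges align with the CLOCK rising edges, negative when DATA is delayed, and positive when DATA is advanced, and the per-transition value is the same for both test patterns.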

Each data bit has a duration t1 743. The reciprocal of the data bit duration t1 
743 is referred to as the data rate. Each CLOCK period has a duration t2 747, where t2 is 
equal to t1. The CLOCK frequency is the reciprocal of the duration t2 747. Thus, the 
CLOCK frequency is equal to the data rate. 

Various modifications will be obvious to one skilled in the art. For example, a 
CLOCK signal with a reversed polarity may be used, such that the transitions of the data 
align with the CLOCK falling edges. 

Figure 8 is the timing diagram of figure 7 for a specific data transition 805. 
Included are inputs CLOCK 810 and DATA 820, and resulting signals Q1 830, Q3 840, DEL 
850, Q2 860, ERROR 870, and REFERENCE 880. In this example, DATA 820 transition 
805 occurs at a time corresponding to the rising edge 802 of CLOCK signal 810. Q1 is equal 
to the DATA signal shifted in time and aligned with the next falling edge of the CLOCK 810. 
Q3 is equal to Q1 delayed by one CLOCK cycle. DEL 850 is Q1 830 delayed in time. 
Ignoring any clock-to-Q or set-up and hold times, this delay is between one-half a CLOCK 
cycle and one and one-half CLOCK cycles. This range is shown by times t1 835 and t2 845. 
If DEL 850 follows Q1 830 either too closely or too remotely, the second flip-flop 350 
latches the DEL signal on the wrong rising edge of the clock. As above, if the signals are 
generated by flip-flops, the delay between DEL 850 and Q1 830 is greater than one-half a 
CLOCK cycle, less a clock-to-Q delay, plus a hold time, but less than one and one-half 
CLOCK cycles, less a clock-to-Q delay, less a set-up time. 
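
The allowable delay range stated above can be written as a pair of bounds. The following Python sketch simply evaluates those bounds; the function names, parameter names, and the example timing values (in picoseconds) are hypothetical and are not taken from the disclosure.

```python
def del_delay_window(clock_period, clock_to_q=0.0, setup=0.0, hold=0.0):
    """Return (lower, upper) bounds on the delay between Q1 and DEL.

    lower: one-half CLOCK cycle, less a clock-to-Q delay, plus a hold time.
    upper: one and one-half CLOCK cycles, less a clock-to-Q delay, less a
           set-up time.
    All arguments share the same time unit.
    """
    lower = 0.5 * clock_period - clock_to_q + hold
    upper = 1.5 * clock_period - clock_to_q - setup
    return lower, upper


def del_delay_ok(delay, clock_period, clock_to_q=0.0, setup=0.0, hold=0.0):
    """True if the second flip-flop would latch DEL on the intended rising edge."""
    lower, upper = del_delay_window(clock_period, clock_to_q, setup, hold)
    return lower < delay < upper


# Illustrative values only: 100 ps CLOCK period, 15 ps clock-to-Q,
# 10 ps set-up time, 5 ps hold time.
print(del_delay_window(100, clock_to_q=15, setup=10, hold=5))
print(del_delay_ok(100, 100, clock_to_q=15, setup=10, hold=5))
```

With all flip-flop timing parameters set to zero, the window reduces to the range between one-half and one and one-half CLOCK cycles described for the ideal case.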

Q2 860 is equal to DEL 850 delayed and aligned with the next rising edge of 
the CLOCK signal 810. Again, the DATA signal may be delayed by a time approximately 
equal to the phase delay between Q1 and the falling edge of the CLOCK signal 810, resulting 
in the signal CPQX. The ERROR signal 870 is the XOR of CPQX and Q1 830. In some 
applications, the DATA signal may not need to be delayed, and the DATA signal itself may 
be XORed with Q1 to generate the ERROR signal. The REFERENCE signal is the XOR 
of Q2 860 and Q3 840. As can be seen in this diagram, an ERROR pulse 815 and a 
REFERENCE pulse 825 result from the data transition 805. 

Figure 9 is the timing diagram of figure 8 with a phase error t3 introduced 
between the data transition 905 and CLOCK rising edge 902. Included are inputs CLOCK 
910 and DATA 920, and resulting signals Q1 930, Q3 940, DEL 950, Q2 960, ERROR 970, 
and REFERENCE 980. Again, the transition 905 in DATA 920 results in a pulse in ERROR 
waveform 970, specifically 915, and a REFERENCE pulse 925. But this time, since the DATA 
920 has been delayed, ERROR pulse 915 is narrower than the corresponding pulse 815 in 
figure 8. Specifically, ERROR pulse 915 is narrower by an amount shown here as t4 917. In 
most cases, t4 is approximately equal to t3. Accordingly, the average value of ERROR signal 
970 is lower than the average value of ERROR signal 870 in figure 8. But again, since the 
REFERENCE pulse 925 is defined by the falling and rising edges of the CLOCK signal 910, 
its width does not change as compared to REFERENCE pulse 825 in figure 8. Therefore, the 
difference between the ERROR signal and the REFERENCE signal has changed, and this 
difference signal is used to correct for the phase error between DATA transitions such as 905 
and the rising edges of the CLOCK 910. 

Figure 10 graphs the ERROR voltage and REFERENCE voltage outputs for a 
full-rate phase detector consistent with one embodiment of the present invention. The 
ERROR signal 1010 and REFERENCE signal 1020 voltages are graphed as a function of the 
phase error between the data and CLOCK signals. ERROR signal 1010 is proportional to the 
phase error. ERROR signal 1010 may be linear. Alternately, ERROR signal 1010 may have 
non-linear characteristics. REFERENCE signal 1020 is approximately independent of the phase 
error, but is a function of the data pattern. ERROR signal 1010 and REFERENCE signal 
1020 may become discontinuous or notched when the phase error is near plus or minus 180 
degrees. 

Figure 11 is a flow chart for a method of detecting phase errors between a data 
signal and a clock signal, consistent with one embodiment of the present invention. In act 
1110, a data input and a clock input having rising and falling edges are provided. The data 
input is stored in a first flip-flop on the clock falling edges in act 1120. In act 1130 the first 
flip-flop's output is stored in a third flip-flop on the clock falling edges. The first flip-flop's 
output is delayed in act 1140, and this delayed output is stored in a second flip-flop on the 
clock rising edges. The data signal and the first flip-flop's output are XORed to generate an 
error signal in act 1160. In act 1170 the second and third flip-flops' outputs are XORed to 
generate a reference signal. 

It will be obvious to one skilled in the art that various modifications and 
additions can be made to this flow chart. For example, in generating the error signal, the data 
signal may be delayed so as to match the first flip-flop's clock-to-Q delay. Also, the error and 
reference signals may be applied to a charge pump, or directly to a loop filter, in order to 
generate a VCO control voltage. 

Embodiments of the present invention have been explained with reference to 
particular examples and figures. Other embodiments will be apparent to those of ordinary 
skill in the art. Therefore, it is not intended that this invention be limited except as indicated 
by the claims. 